[SPARK-5363] fix bug: remove() inside iterator is not safe #4776
davies wants to merge 1 commit into apache:master
cc @JoshRosen
Test build #27969 has started for PR 4776 at commit
LGTM pending Jenkins.
Context for other reviewers: removing elements from a mutable HashSet while iterating over it can cause the iteration to skip over entries that weren't removed. In this case, this would cause PythonRDD to write fewer than
Test build #27969 has finished for PR 4776 at commit
Test PASSed.
I'm going to merge this into
Removing elements from a mutable HashSet while iterating over it can cause the iteration to incorrectly skip over entries that were not removed. If this happened, PythonRDD would write fewer broadcast variables than the Python worker was expecting to read, which would cause the Python worker to hang indefinitely.

Author: Davies Liu <davies@databricks.com>

Closes #4776 from davies/fix_hang and squashes the following commits:

a4384a5 [Davies Liu] fix bug: remvoe() inside iterator is not safe

(cherry picked from commit 7fa960e)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
It is not safe to remove items from a collection while iterating over it.
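
For readers unfamiliar with the pitfall, here is a minimal Scala sketch of one common way to avoid it: collect the entries to drop into a separate, strict collection first, then mutate the original set while iterating over that snapshot. This is an illustration only and not necessarily how this patch is written; `SafeRemoval`, `removeStale`, `cached`, and `wanted` are hypothetical names chosen for the example.

```scala
import scala.collection.mutable

object SafeRemoval {
  /** Removes every id in `cached` that is not in `wanted` and returns the removed ids.
    * Hypothetical helper for illustration; not taken from the Spark code base.
    */
  def removeStale(cached: mutable.HashSet[Long], wanted: Set[Long]): List[Long] = {
    // Build a strict snapshot of the ids to drop BEFORE mutating `cached`.
    // The snapshot is a separate List, so removals below cannot disturb its traversal.
    val toRemove: List[Long] = cached.filterNot(id => wanted.contains(id)).toList

    // Iterate over the snapshot, not over `cached` itself.
    for (id <- toRemove) {
      cached.remove(id)
    }
    toRemove
  }

  def main(args: Array[String]): Unit = {
    val cached = mutable.HashSet[Long](1L, 2L, 3L, 4L)
    val removed = removeStale(cached, Set(2L, 4L, 5L))
    println(s"removed=${removed.sorted}, remaining=${cached.toList.sorted}")
  }
}
```

Because the number of removed entries is known from the snapshot before any mutation happens, a caller that must announce a count up front (as when writing a length-prefixed stream) can rely on that count matching what is actually written.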